Proteome-scale characterisation of motif-based interactome rewiring by disease mutations

Whole genome and exome sequencing are reporting on hundreds of thousands of missense mutations. Taking a pan-disease approach, we explored how mutations in intrinsically disordered regions (IDRs) break or generate protein interactions mediated by short linear motifs. We created a peptide-phage display library tiling ~57,000 peptides from the IDRs of the human proteome overlapping 12,301 single nucleotide variants associated with diverse phenotypes including cancer, metabolic diseases and neurological diseases. By screening 80 human proteins, we identified 366 mutation-modulated interactions, with half of the mutations diminishing binding, and half enhancing binding or creating novel interaction interfaces. The effects of the mutations were confirmed by affinity measurements. In cellular assays, the effects of motif-disruptive mutations were validated, including loss of a nuclear localisation signal in the cell division control protein CDC45 by a mutation associated with Meier-Gorlin syndrome. The study provides insights into how disease-associated mutations may perturb and rewire the motif-based interactome.

Thank you again for submitting your work to Molecular Systems Biology. We have now heard back from the three reviewers who agreed to evaluate your study. As you will see below, the reviewers are overall quite positive about the relevance of the study for the field. They do however raise a series of concerns which we would ask you to address in a revision. I think that the issues raised by the reviewers are rather clear and seem straightforward to address. All issues raised need to be satisfactorily addressed. Please feel free to contact me in case you would like to discuss in further detail any of the concerns raised. I would be happy to schedule a call.
On a more editorial level, we would ask you to address the following points:
-The keywords need to be reduced to 5.
-Please provide a .doc version of the manuscript text (including legends for the main figures) and individual production quality figure files for the main Figures (one file per figure). Tables should be included in the manuscript text (at the very end), together with their description/legend.
-We have replaced Supplementary Information by the Expanded View (EV format). In this case, all additional figures can be included in a PDF called Appendix. Appendix figures should be labeled and called out as: "Appendix Figure S1, Appendix Figure S2... Appendix Table S1..." etc. Each legend should be below the corresponding Figure/Table in the Appendix. Please include a Table of Contents in the beginning of the Appendix. For detailed instructions regarding expanded view please refer to our Author Guidelines: .
-Tables S1-S11 should be provided as Datasets EV1-EV11. Please provide one file per EV Dataset. Each file should include the description of the EV Dataset in a separate tab.
-Please provide a "standfirst text" summarizing the study in one or two sentences (approximately 250 characters), three to four "bullet points" highlighting the main findings and a "synopsis image" (550px width and max 400px height, jpeg format) to highlight the paper on our homepage.
-All Materials and Methods need to be described in the main text. We would encourage you to use 'Structured Methods', our new Materials and Methods format. According to this format, the Materials and Methods section should include a Reagents and Tools Table (listing key reagents, experimental models, software and relevant equipment and including their sources and relevant identifiers) followed by a Methods and Protocols section in which we encourage the authors to describe their methods using a step-by-step protocol format with bullet points, to facilitate the adoption of the methodologies across labs. More information on how to adhere to this format as well as downloadable templates (.doc or .xls) for the Reagents and Tools Table can be found in our author guidelines: . An example of a Method paper with Structured Methods can be found here: .
-Please include a "Disclosure and Competing Interests Statement" in the main text.
-Please include a Data availability section describing how the data, code etc. have been made available. This section needs to be formatted according to the example below:
The datasets and computer code produced in this study are available in the following databases:
-Chip-Seq data: Gene Expression Omnibus GSE46748 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE46748)
-Modeling computer scripts: GitHub (https://github.com/SysBioChalmers/GECKO/releases/tag/v1.0)
-[data type]: [full name of the resource] [accession number/identifier] ([doi or URL or identifiers.org/DATABASE:ACCESSION])
-For data quantification: please specify the name of the statistical test used to generate error bars and P values, the number (n) of independent experiments (specify technical or biological replicates) underlying each data point and the test used to calculate p-values in each figure legend. The figure legends should contain a basic description of n, P and the test applied. Graphs must include a description of the bars and the error bars (s.d., s.e.m.).

Reviewer #1:

Minor points
page 5, third line: the grammar of this sentence does not seem right
page 5, section title: reveal, not reveals
page 5, second paragraph: I suggest rephrasing for clarity: We found that the peptide-containing prey proteins shared GO terms related to the GO categories subcellular localization, molecular function, and biological processes with ... (I had to read the sentence multiple times and look at the figure to understand what you were referring to.)
page 5, second paragraph towards the end: Your statement about the different number of interactors found for G3BP1/2 in the previous screen and the one reported here makes me wonder about the sensitivity of the screens and whether you can estimate the fraction of mutations screened whose effect you simply failed to detect due to limited sensitivity of the phage display screens. If you can estimate this, it would be nice to add it to the manuscript. If you cannot estimate it, it would still be nice to briefly discuss this in the discussion section.
page 6 section title: selection and not selections
page 7, second paragraph: There are more than 100 domain-mutation pairs for which no motif consensus sequence could be identified in the peptide. Quite some of these are also shown in the little networks drawn in Fig 4c-f and Fig 5c-d. How can they be interpreted? Do they point to extensions of the known motif consensus or alternative binding modes for the domains or to false positives? It would be nice if the authors were able to do more analyses on these or at least speculate on their interpretation in the discussion section.
page 8, first line of 2nd paragraph: the "that" is too much, write "which interactions are lost and ..."
page 8, 2nd paragraph: The authors report at the beginning of the section that interactions functioning in the ubiquitin system are more commonly lost by disease mutations while interactions functioning in the autophagy system are more often enabled, for example. The pie charts shown in
page 12, discussion: You state that "many of the deregulated interactions are thought to map to SLiM-based interfaces..." I would friendly disagree with this statement. I think that it is very unclear to which extent mutations in disordered regions are disease-causing because they for example fall into SLiMs. Many scientists are still uninformed about the many roles of disordered regions in proteins for protein function. As a consequence most mutations falling in disordered regions remain uncharacterized while mutations falling in domains are more readily called pathogenic (this is based on own observations and knowing how variant effect predictors work). I would suggest to reconsider this statement and potentially rephrase. This is also exactly one of the highlights of the manuscript as a first experimental study that shows the potential of deregulation of SLiM-domain interactions in disease.

Fig 2e:
The coloring of the wild-type vs mutant title of the figure confuses me. Aren't the blue and red dots representing enrichments or depletions of mutant vs wild-type peptide reads? If I am not mistaken then I would not color the title this way or change to "disabling vs enhancing mutations". I also find the log 10 on the y-axis non-intuitive. Wouldn't a log 2 make more sense for visualization purposes?
Fig 6b and 6c: Please add to the legend an explanation of the node shape. Why did you decide to show known interactions between the preys? I would have found it more interesting to see which of the bait-prey interactions were known before.
I would encourage the authors to submit their interaction data to IntAct as a public repository for protein interaction data.

Reviewer #2:
Advancements in whole genome and exome sequencing have yielded rapid discovery of human missense variants which greatly outpaces current methods to functionally annotate them. Many of these variants map to intrinsically disordered regions (IDRs) which harbor most of the proteome's short linear motifs (SLiMs). SLiMs are short interaction modules 3-10 amino acids long which regulate key cellular processes such as localization, transactivation, complex formation and so on. However, their generally low binding affinities have made SLiM-based interactions difficult to capture in existing high-throughput protein-protein interaction assays. Previously, the authors established large-scale proteomic peptide phage display (ProP-PD) for proteome-wide discovery of motif-mediated interactions. To understand the extent to which single nucleotide variants (SNVs) can exert SLiM network rewiring, the authors expanded their ProP-PD library to include disease-associated mutations, generating 12,301 unique single nucleotide variants (SNVs) across the IDRs of 1,915 prey proteins.
The authors interrogated binding of both wild-type and mutant prey peptides against 80 bait protein domains and identified 275 mutations to modulate SLiM-based PPIs, with 367 unique domain-mutation pairs and roughly equal proportions either diminishing or enhancing interactions. By mapping mutations back to the prey motif consensus sequence within which baits bound, they found that mutations in key residues diminished interactions whereas motif-creating mutations enabled more interactions. Frequently altered interactions with E3 ligases and proteins involved in trafficking and scaffolding point to dysregulation of protein abundance and localization, respectively, as common features underlying disease progression.
Overall, the rationale for the study is sound, the experiments are well conducted, and the authors' conclusions are generally supported by the data. The authors are careful not to overstate the significance of their findings, which I find refreshing! The study will be of high interest to many scientists working on variant annotation, network biology, protein/protein interactions, and rare diseases. I think the study is suitable for publication in MSB but I suggest the authors address the points raised below. They should be relatively easy to address.

Major points
1. Phage display is an incredibly powerful assay for large-scale studies, but one of the main downsides is that it is purely an in vitro assay with purified proteins and phage-displayed peptides. Inevitably, there is always an underlying question of how the results translate to cells. The authors are careful with their interpretation and validate two interaction-disrupting or interaction-diminishing mutations in cells.
I commend the authors for including their negative results with interaction-enabling mutations, since it would have been easy to just leave those negative results out (which I suspect happens too often in proteomics validation studies). It is indeed possible, as the authors state, that the interactions may not be strong enough in the cellular context. Alternatively, the disordered region may not be easily accessible in living cells. However, I think it would be still worth it to try alternative assays with some of them. For example, increased binding to MAP3LC3B could increase autophagy-dependent turnover of BUB1 S492F, which could be assessed by western blotting after cycloheximide treatment +/- autophagy inhibition. For CTNNB1 S33F, it would be interesting to know if the mutant localizes to G3BP1-containing stress granules. This could be done by treating cells with arsenite prior to assessing CTNNB1 localization by immunofluorescence or using GFP fusions.
If indeed it turns out that no interaction-enabling mutations can be validated with full-length proteins, the authors should be cautious about wording throughout the manuscript. The concept of interaction-enabling mutations is a good one here. But for example, the last sentence of the abstract ("The study provides a panoramic view of how disease-associated mutations perturb and rewire the motif-based interactome.") is rather strong given that not a single variant potentially increasing existing interactions or creating novel interactions was validated.
2. Were the two interaction-disrupting interactions and three interaction-enhancing interactions the only ones the authors attempted to validate?
3. I am not sure if I understand the analysis that was done for Figure 3B. The authors compare Mann-Whitney p-values for each category and derive another p-value from one-way ANOVA. Is it not possible to compare the actual values in each category instead of p-values? For example, values could be the fraction of variants whose interaction is affected by mutations in the key residue, flanking residues, or wildcard positions. Or alternatively, the mutation enrichment score as in Figure 3A. The same comment applies to Figure 3C with Grantham scores. Perhaps I do not understand the analysis correctly, but at least the authors should explain their rationale more clearly.
4. "Of the PPIs, 106 have been previously reported in public databases and out of those 36 were also found in our previous study". Is this overlap statistically significant?
5. "Finally, we note that relaxing the p-value cut-off to p-value ≤ 0.01 would increase the numbers to 854 domain-mutation pairs affected by mutation with half being promoted (429) and half being diminished (425) by the mutation." Another way to define an appropriate threshold could be by using a benchmark set and quantifying precision/recall against the benchmark at different score thresholds.
6. Were the benign variants equally disruptive of interactions as pathogenic variants were?
7. How large were the peptides? What was the tiling window size?
8. What determined the selection of bait proteins?
9. Finally, the authors discuss the possibility of drugging motif-based interactions, and it is indeed an exciting avenue, although very challenging. In the discussion, the authors could also mention molecular glues that can modulate existing protein/protein interaction interfaces or create entirely new ones. A great example is a recent study showing that the motif-based interaction between mutant beta-catenin and beta-TrCP E3 ligase can be stabilized with a small molecule (PMID 30926793).

Minor comments
1. Fig 1A: Single nucleotide variances -> Single nucleotide variants
2. In Figure 2B, instead of (or in addition to) pie graphs it would be useful to show the relative fraction of each category in the original pool, in domain-region interactions, and in perturbed interaction pairs. The same comment applies to Figure 4A, as it is difficult to compare two pie charts to each other (especially with so many categories).
3. Many plots have log10(p) as the Y axis, but how the p-value was derived is not indicated in the plot.
4. When you first reference the GenVar library, it is not explicitly introduced. It would be good to state that the "novel phage library" is the GenVar_HD2.
5. I'm wondering if the statement that 22% of disease-associated missense mutations map to IDRs is still valid, given that the original estimates were generated prior to resources like ClinVar and gnomAD. Are there more recent estimates?

Reviewer #3:
In this study, the authors explore the role of mutations in intrinsically disordered regions on short linear motif-dependent protein-protein interactions. They apply a previously utilized phage-display prey-bait methodology to this problem, and demonstrate clear effects of many disease causing mutations (somatic and germline). They confirm their findings in these short-sequence phage screens with select intact protein binding measurements, as well as in cell function assays. This is an excellent study that adds to the growing body of knowledge regarding the effects of disease-causing genetic variants on protein-protein interactions and their functional consequences. The proteomics and genetics communities will be most interested in these results, and will likely use them to expand our understanding of protein-protein interactome function in health and disease, as well as to identify potential novel therapeutic targets for complex diseases.

Comments for the authors follow:
--On page 3, paragraph 2, please comment on the frequency of disease-causing mutations localized to protein-protein binding interface sites vs. non-interface sites for somatic and germline mutations.
--How does the degree of the mutant protein in the PPI affect the importance (likelihood) of short linear motif-dependent binding variations on pathogenicity?
--For proteins with degree greater than 1, how does a short linear motif mutation that affects one binding interaction affect other binding interactions (if at all)? Are there allosteric consequences to these mutations? Addressing this question with one or two examples would be helpful.

Reviewer #1:
Kliche et al report on the results of a phage display screen for 80 motif-binding domains against a peptide library that was designed to cover disordered regions in human proteins that carry missense mutations that are associated with disease. Of note, the peptide library contained wild type and mutant peptides enabling a mapping of mutations that would decrease or increase binding to any of the presented 80 domains. To the best of my understanding this is the largest and also most systematic investigation of the effect of disease mutations falling in short linear motifs (SLiMs) on the binding of folded domains. The authors report on close to 400 domain-mutation pairs that significantly altered binding based on a stringent cutoff, of which about half increased and the other half decreased binding. Findings were further validated in orthogonal assays using fluorescence polarization with domains and peptides as well as coIP with full length proteins and microscopy. The study suggests that alterations of domain-SLiM mediated protein interactions might contribute to the manifestation of a diverse set of genetic diseases. The study also provides first ideas for possible molecular mechanisms underlying the pathogenicity of reported mutations.

Reply:
We thank the reviewer for appreciating the novelty of our study.

Minor points
Comment 1: page 5, third line: the grammar of this sentence does not seem right
Reply: We have corrected this.
Comment 2: page 5, section title: reveal, not reveals
Reply: We have corrected this, thank you for spotting it.
Comment 3: page 5, second paragraph: I suggest rephrasing for clarity: We found that the peptide-containing prey proteins shared GO terms related to the GO categories subcellular localization, molecular function, and biological processes with ... (I had to read the sentence multiple times and look at the figure to understand what you were referring to.)
Reply: Thanks for pointing this out, we have rephrased the sentence.
Comment 4: page 5, second paragraph towards the end: Your statement about the different number of interactors found for G3BP1/2 in the previous screen and the one reported here makes me wonder about the sensitivity of the screens and whether you can estimate the fraction of mutations screened whose effect you simply failed to detect due to limited sensitivity of the phage display screens. If you can estimate this, it would be nice to add it to the manuscript. If you cannot estimate it, it would still be nice to briefly discuss this in the discussion section.
30th Apr 2024 1st Authors' Response to Reviewers
Reply: Good point. Based on the available knowledge of SLiM-based interactions, we calculated the recall of known SLiM-based interactions from the phage selections using the GenVar-HD2 library and found it to be 19.9%, which is very similar to the previously estimated recall of 19.5% for selections against the HD2 library. While we cannot directly calculate the fraction of screened mutations whose effect we fail to detect (simply because there is not sufficient data to use for comparison), we can estimate that we miss about 80% of the interactions, and hence likely in the order of 80% of the cases in which the mutations have an effect. However, this is a rough estimate, as some interactions are perturbed by many mutations (that is, mutational hotspots) while others are only affected by one or a few mutations. We have added the recall estimate to the results (page 5), and also added the topic to the discussion (page 13).
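The arithmetic behind this rough estimate can be sketched as follows. This is an illustration only, not the code used in the study: it assumes that the fraction of missed interactions is simply one minus the recall, and the function name `missed_fraction` is hypothetical.

```python
def missed_fraction(recall: float) -> float:
    """Fraction of known interactions expected to go undetected.

    Assumption (for illustration): missed fraction = 1 - recall.
    """
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    return 1.0 - recall


# Recall values quoted in the reply above.
recall_by_library = {"GenVar-HD2": 0.199, "HD2": 0.195}

for library, recall in recall_by_library.items():
    print(f"{library}: recall {recall:.1%}, "
          f"estimated missed fraction {missed_fraction(recall):.1%}")
```

As the reply notes, carrying this miss rate over from interactions to mutation effects is only approximate, because some interactions are affected by many mutations and others by only one or a few.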
Comment 5: page 6 section title: selection and not selections
Reply: We have corrected this. Thank you.
Comment 6: page 7, second paragraph: There are more than 100 domain-mutation pairs for which no motif consensus sequence could be identified in the peptide. Quite some of these are also shown in the little networks drawn in Fig 4c-f and Fig 5c-d. How can they be interpreted? Do they point to extensions of the known motif consensus or alternative binding modes for the domains or to false positives? It would be nice if the authors were able to do more analyses on these or at least speculate on their interpretation in the discussion section.
Reply: Thank you for raising this point. It made us realise a mistake in our analysis, which caused us to miss annotating motifs in a number of peptides. We have now updated the analysis (see Dataset EV6) and included the motif information used for each bait protein (see Dataset EV2D-E). There are still 68 domain-mutation pairs for which there was no match with the previously annotated (or here defined) motifs. From a visual inspection, it appears that some of the cases are missed due to overly narrowly defined motifs or variations of the motifs, and potentially the presence of variant motifs. There may of course also be some false positives. We have added a comment on this to the discussion (page 16).
Comment 7: page 8, first line of 2nd paragraph: the "that" is too much, write "which interactions are lost and ..."

Reply:
We have corrected this. Thank you.
Comment 8: page 8, 2nd paragraph: The authors report at the beginning of the section that interactions functioning in the ubiquitin system are more commonly lost by disease mutations while interactions functioning in the autophagy system are more often enabled, for example. The pie charts shown in Fig 4a I thought were not very helpful to visually support this statement and it certainly does not allow for a quantification. If possible, it would be nice to compute enrichment scores to support the statements.
Reply: Thank you for raising this point. We replaced the pie plots with a back-to-back bar plot (Figure 4A) that we find better illustrates the observed variations.
Comment 10: page 11, first couple of lines: Can you quantify your statement that no disease category was overrepresented in terms of mutations that showed an effect on domain-binding? Fig 6a is not very helpful to visually support this statement.

Reply:
We agree that the figure did not support the expressed point in a convenient way. We therefore compiled the values and added the information to Dataset EV6B for an easy overview. The results show that only 3 disease categories had an increased or decreased representation after selection as compared to the library. The peptides with mutations associated with cardiovascular/hematologic disease disappeared after selection, possibly because none of the bait domains in our selection bound to these peptides. For the peptides with mutations associated with immune disease there was a minor reduction (32%) of representation after selection. In contrast, the fraction of mutations associated with reproductive disorders doubled after selection. However, in most cases, these mutations had no effect on binding as judged by the phage data, and the increase in the representation of the mutations of reproductive disorders likely has more to do with the fact that several of these peptides contain SH3 binding motifs, and we have several bait SH3 domains. Thus, the observed effect likely represents a sampling bias.
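As a hedged illustration of how such changes in representation can be expressed (this is not the analysis code behind Dataset EV6B), the relative change of a disease category's fraction after selection versus in the library could be computed as below. The function name `relative_change` is hypothetical, and the fractions are placeholders chosen only to mirror the changes described above; they are not values from the dataset.

```python
def relative_change(fraction_in_library: float,
                    fraction_after_selection: float) -> float:
    """Relative change in a category's representation after selection.

    -0.32 means a 32% reduction, +1.0 means the fraction doubled,
    -1.0 means the category disappeared.
    """
    if fraction_in_library <= 0:
        raise ValueError("fraction_in_library must be positive")
    return (fraction_after_selection - fraction_in_library) / fraction_in_library


# Placeholder fractions (NOT taken from Dataset EV6B).
placeholder_fractions = {
    "immune": (0.050, 0.034),
    "reproductive": (0.020, 0.040),
    "cardiovascular/hematologic": (0.010, 0.000),
}

for category, (library, selected) in placeholder_fractions.items():
    print(f"{category}: {relative_change(library, selected):+.0%}")
```

A change computed this way says nothing about whether the mutations affect binding; as noted above, a shift in representation can simply reflect the composition of the bait collection.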
Comment 11: page 12, discussion: You state that "many of the deregulated interactions are thought to map to SLiM-based interfaces..." I would friendly disagree with this statement. I think that it is very unclear to which extent mutations in disordered regions are disease-causing because they for example fall into SLiMs. Many scientists are still uninformed about the many roles of disordered regions in proteins for protein function. As a consequence most mutations falling in disordered regions remain uncharacterized while mutations falling in domains are more readily called pathogenic (this is based on own observations and knowing how variant effect predictors work). I would suggest to reconsider this statement and potentially rephrase. This is also exactly one of the highlights of the manuscript as a first experimental study that shows the potential of deregulation of SLiM-domain interactions in disease.

Reply:
We thank the reviewer for pointing this out. We have rephrased the sentence following the constructive suggestion.
Comment 12: Fig 2e: The coloring of the wild-type vs mutant title of the figure confuses me. Aren't the blue and red dots representing enrichments or depletions of mutant vs wild-type peptide reads? If I am not mistaken then I would not color the title this way or change to "disabling vs enhancing mutations". I also find the log 10 on the y-axis non-intuitive. Wouldn't a log 2 make more sense for visualization purposes?
Reply: We agree with the reviewer and have adjusted the title of the plot and the scale of the y-axis according to the comment above.We hope that the visualisation is now clearer.
Comment 13: Fig 6b and 6c: Please add to the legend an explanation of the node shape. Why did you decide to show known interactions between the preys? I would have found it more interesting to see which of the bait-prey interactions were known before.
Reply: Thank you for pointing out the lack of information. We added an explanation for the node shape (square for bait proteins, octagonal for prey proteins) to the figure legend. We visualised the prey-prey protein interactions to provide a structured framework to the network, which we find useful. But we agree that the information on previously reported bait-prey interactions should be integrated in the network and have modified the figure accordingly (green dotted line for the previously reported interactions).
Comment 15: I would encourage the authors to submit their interaction data to IntAct as a public repository for protein interaction data.
Reply: We fully agree and have deposited the results in IntAct (ID: IM-30020). We have added the text "The protein interactions from this publication have been submitted to the IMEx (http://www.imexconsortium.org) consortium through IntAct (Del Toro et al., 2022) and assigned the identifier IM-30020" to the Data Availability section.

Reviewer #2:
Advancements in whole genome and exome sequencing have yielded rapid discovery of human missense variants which greatly outpaces current methods to functionally annotate them. Many of these variants map to intrinsically disordered regions (IDRs) which harbor most of the proteome's short linear motifs (SLiMs). SLiMs are short interaction modules 3-10 amino acids long which regulate key cellular processes such as localization, transactivation, complex formation and so on. However, their generally low binding affinities have made SLiM-based interactions difficult to capture in existing high-throughput protein-protein interaction assays. Previously, the authors established large-scale proteomic peptide phage display (ProP-PD) for proteome-wide discovery of motif-mediated interactions. To understand the extent to which single nucleotide variants (SNVs) can exert SLiM network rewiring, the authors expanded their ProP-PD library to include disease-associated mutations, generating 12,301 unique single nucleotide variants (SNVs) across the IDRs of 1,915 prey proteins.
The authors interrogated binding of both wild-type and mutant prey peptides against 80 bait protein domains and identified 275 mutations to modulate SLiM-based PPIs, with 367 unique domain-mutation pairs and roughly equal proportions either diminishing or enhancing interactions. By mapping mutations back to the prey motif consensus sequence within which baits bound, they found that mutations in key residues diminished interactions whereas motif-creating mutations enabled more interactions. Frequently altered interactions with E3 ligases and proteins involved in trafficking and scaffolding point to dysregulation of protein abundance and localization, respectively, as common features underlying disease progression.
Overall, the rationale for the study is sound, the experiments are well conducted, and the authors' conclusions are generally supported by the data. The authors are careful not to overstate the significance of their findings, which I find refreshing! The study will be of high interest to many scientists working on variant annotation, network biology, protein/protein interactions, and rare diseases. I think the study is suitable for publication in MSB but I suggest the authors address the points raised below. They should be relatively easy to address.

Reply:
We thank the reviewer for the expressed interest in our study and for the helpful suggestions.

Major points
Comment 1. Phage display is an incredibly powerful assay for large-scale studies, but one of the main downsides is that it is purely an in vitro assay with purified proteins and phage-displayed peptides. Inevitably, there is always an underlying question of how the results translate to cells. The authors are careful with their interpretation and validate two interaction-disrupting or interaction-diminishing mutations in cells.
I commend the authors for including their negative results with interaction-enabling mutations, since it would have been easy to just leave those negative results out (which I suspect happens too often in proteomics validation studies). It is indeed possible, as the authors state, that the interactions may not be strong enough in the cellular context. Alternatively, the disordered region may not be easily accessible in living cells. However, I think it would be still worth it to try alternative assays with some of them. For example, increased binding to MAP3LC3B could increase autophagy-dependent turnover of BUB1 S492F, which could be assessed by western blotting after cycloheximide treatment +/- autophagy inhibition. For CTNNB1 S33F, it would be interesting to know if the mutant localizes to G3BP1-containing stress granules. This could be done by treating cells with arsenite prior to assessing CTNNB1 localization by immunofluorescence or using GFP fusions.
If indeed it turns out that no interaction-enabling mutations can be validated with full-length proteins, the authors should be cautious about wording throughout the manuscript. The concept of interaction-enabling mutations is a good one here. But for example, the last sentence of the abstract ("The study provides a panoramic view of how disease-associated mutations perturb and rewire the motif-based interactome.") is rather strong given that not a single variant potentially increasing existing interactions or creating novel interactions was validated.
Reply: We performed the experiments suggested by the reviewer. For the CTNNB1 experiment, we transiently transfected HeLa cells with either YFP-CTNNB1 wild-type, YFP-CTNNB1 S33F mutant or YFP-G3BP1 and performed live cell microscopy of the cells in response to 0.5 mM arsenite treatment. While YFP-G3BP1 localised to stress granules upon treatment, as previously reported (PMID: 38177924), neither YFP-CTNNB1 wild-type nor the YFP-CTNNB1 S33F mutant changed its localisation (Figure A). The experiment was performed in a single biological repeat. The expression of the CTNNB1 constructs was challenging, as observed previously, but we observed the same behaviour in at least ten cells per construct (wild-type: ca. 10 cells, mutant: ca. 20 cells). We hence conclude that arsenite treatment does not result in the recruitment of CTNNB1 wild-type or mutant to stress granules.
For the BUB1 experiment, we used HeLa cell lines stably expressing Flag-BUB1 wild-type or Flag-BUB1 mutant and performed a cycloheximide (300 µg/mL) chase experiment either with or without preceding autophagy blockage by bafilomycin A (100 nM) for ca. 20 h. Cells were collected at time points 0 min, 30 min, 60 min, 120 min and 240 min of the chase experiment.
The experiment was evaluated by immunoblotting for endogenous GAPDH, endogenous CDC20 and the stably expressed Flag-BUB1 constructs (Figure B). First, we judge that the chase experiment worked because the levels of CDC20 decreased with time in the cells that were not treated with bafilomycin A. The chase of the BUB1 constructs is less obvious, presumably due to the stable expression of the BUB1 constructs, but quantification of the blot demonstrates degradation at 240 min (Figure C). In a previous experiment, we tried to extend the chase to 360 min, but the cells started to die at that time point, particularly when treated with bafilomycin A. The quantification suggests that the BUB1 S492F mutant is degraded slightly faster and more linearly than the BUB1 wild-type; however, this is irrespective of the bafilomycin A treatment (Figure B). From that, we conclude that the potential differential degradation behaviour is independent of the autophagic pathway. The experiment was performed as a single repeat. Following the suggestion of the reviewer, we have softened the last sentence of the abstract.
Comment 2: Were the two interaction-disrupting interactions and three interaction-enhancing interactions the only ones the authors attempted to validate?
Reply: We tested the effect of one more mutation on the interaction in the context of the full-length proteins, namely the effect of the P348L SQSTM1 mutation on the binding to MAP1LC3B (ATG8). It is a known interaction (PMID: 17580304), which the GenVar selection results suggested to be enhanced by the mutation. One study supports this observation, stating on the basis of a co-IP experiment: "The affinity of p62 P348L to LC3-II was also higher" (PMID: 31362587). However, we found in affinity measurements that the mutation had no impact on binding (Appendix Figure S5), and our co-IPs gave inconsistent results between the triplicates. Because our own experiments conflict with each other and with the literature, we decided not to show the blotting results (the affinity experiments are included in the manuscript), but we can provide them if desired.
Comment 3. I am not sure if I understand the analysis that was done for Figure 3B. The authors compare Mann-Whitney p-values for each category and derive another p-value from one-way ANOVA. Is it not possible to compare the actual values in each category instead of p-values? For example, values could be the fraction of variants whose interaction is affected by mutations in the key residue, flanking residues, or wildcard positions. Or alternatively, the mutation enrichment score as in Figure 3A. The same comment applies to Figure 3C with Grantham scores. Perhaps I do not understand the analysis correctly, but at least the authors should explain their rationale more clearly.
Reply: After considering the reviewer's comment, we agree that the conducted analysis was sub-optimal. We also found errors in our motif annotation pipeline based on feedback from Reviewer 1, an issue we have now solved. We updated the data in Dataset EV6 and re-built Figure 3 to improve visualisation. In the new panel 3B we included an enrichment analysis that shows the effect of mutations on the interactions (diminishing/enhancing) depending on whether the mutation alters an existing motif consensus (if any) or creates a new consensus. The new panel 3C is a re-build of the previous panel D with fixed annotations, and now focuses only on those mutations that modulate the interactions (diminishing/enhancing).
Comment 4. "Of the PPIs, 106 have been previously reported in public databases and out of those 36 were also found in our previous study". Is this overlap statistically significant?
Reply: Yes, the overlap is significant. To establish this, we first updated our PPI datasets for the human proteome (December 2023). Our PPI datasets include experimentally validated protein-protein interactions obtained from the IntAct, HIPPIE, HURI, Bioplex and STRING databases. Next, we calculated the total number of possible unique protein-protein pairs, excluding self-binding, considering all baits used in the study and all preys in the GenVar library design (total: 147147 PPIs). We then annotated how many of those protein-protein pairs have been characterised in our PPI dataset (known PPIs: 4638). Thus, of 147147 possible protein-protein pairs, 4638 have been reported as interacting proteins in databases, and we used this as the background for comparison with our selection results. In the current study, we observed 1229 possible unique high-/medium-confidence interactions at the protein-protein level, of which 100 were previously reported as experimentally validated PPIs in the aforementioned databases. Based on the observed values in this study (100 known out of 1229) compared to the background library settings (4638 known out of 147147), we obtained an enrichment of 2.58 (p-value < 1 × 10^-20, hypergeometric test).
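The enrichment calculation in this reply can be sketched as follows. This is a minimal illustration using only the four counts quoted above; the function names are hypothetical, and the choice of a one-sided upper tail, P(X ≥ 100), is an assumption, since the reply does not state which tail or software was used. The exact p-value depends on those choices, so none is quoted here.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fold_enrichment(k, N, K, n):
    """Ratio of the known-PPI fraction in the sample to that in the background."""
    return (k / n) / (K / N)

def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n pairs are drawn without replacement from N pairs,
    of which K are annotated as known PPIs (assumed one-sided test)."""
    log_denom = log_comb(N, n)
    log_terms = [
        log_comb(K, i) + log_comb(N - K, n - i) - log_denom
        for i in range(k, min(K, n) + 1)
    ]
    # Sum in log space to avoid underflow of the very small terms.
    largest = max(log_terms)
    return exp(largest) * sum(exp(t - largest) for t in log_terms)

# Counts quoted in the reply above.
N_PAIRS = 147147          # possible bait-prey protein pairs
KNOWN_BACKGROUND = 4638   # pairs reported in PPI databases
OBSERVED = 1229           # high-/medium-confidence pairs found
KNOWN_OBSERVED = 100      # observed pairs already in the databases

enrichment = fold_enrichment(KNOWN_OBSERVED, N_PAIRS, KNOWN_BACKGROUND, OBSERVED)
p_value = hypergeom_upper_tail(KNOWN_OBSERVED, N_PAIRS, KNOWN_BACKGROUND, OBSERVED)
print(f"fold enrichment = {enrichment:.2f}, upper-tail p = {p_value:.2e}")
```

The fold enrichment is simply (100/1229) divided by (4638/147147); the tail probability is summed in log space because the individual terms are far below ordinary floating-point comfort.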
Comment 5. "Finally, we note that relaxing the p-value cut-off to p-value ≤ 0.01 would increase the numbers to 854 domain-mutation pairs affected by mutation with half being promoted (429) and half being diminished (425) by the mutation." Another way to define an appropriate threshold could be by using a benchmark set and quantifying precision/recall against the benchmark at different score thresholds.
Reply: That is true. We have previously used a benchmark set to define the criteria for differentiating between true positive hits (that is, binding peptides) and background noise. However, for the analysis we are doing here, that is, mapping the effects of the mutations on binding, there is no appropriate benchmarking set, and we thus resorted to the presented strategy and the experimental validations.
Comment 6. Were the benign variants equally disruptive of interactions as pathogenic variants were?
Reply: That is an interesting question. However, our library comprises a much higher fraction of pathogenic than benign mutations (compare Figure 1C), so that the read-out is biased towards finding interactions perturbed by pathogenic mutations. Our selection results report on 1347 individual mutations, of which we find 275 to modulate 367 domain-mutation interactions. These can be mapped back to their clinical categorisation (benign, conflicting interpretation, pathogenic, no information available).

We conclude, based on our still limited data, that benign and pathogenic mutations are equally likely to have a disruptive effect on the interactions observed with the test set of bait proteins used.
Comment 7: How large were the peptides? What was the tiling window size?
Reply: The peptides are 16 amino acids long with a 12 amino acid overlap. We have now specified this in the Methods section.
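The tiling described in this reply (16-residue peptides overlapping by 12, hence a step of 4) can be sketched as below. The function name, the example sequence and the handling of region ends (trailing residues that do not fill a full window are dropped) are assumptions made for illustration; the actual library design may treat the ends differently.

```python
def tile_peptides(sequence, length=16, overlap=12):
    """Return (1-based start, peptide) tuples for overlapping tiles.

    Assumption: trailing residues that do not fill a full window are dropped.
    """
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the peptide length")
    return [
        (start + 1, sequence[start:start + length])
        for start in range(0, len(sequence) - length + 1, step)
    ]

# Hypothetical 24-residue disordered region, used only to show the windowing.
region = "ACDEFGHIKLMNPQRSTVWYACDE"
tiles = tile_peptides(region)
for start, peptide in tiles:
    print(start, peptide)
```

With these settings, each residue away from the region ends is covered by four consecutive peptides, so a given variant position is presented in several sequence contexts.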
Comment 8. What determined the selection of bait proteins?
Reply: We explained the selection on page 5 (Bait collection and phage selections), but are happy to clarify here too. First, we included domains of proteins that have been reported to be implicated in cancer (cancer-associated proteins), based on two previously reported studies (PMID: 34591613, PMID: 33558758). Second, we included domains whose interactions were reported to be disrupted by genetic variation, likewise based on a previous report (PMID: 31515488). Lastly, we aimed to cover a variety of (known) peptide-binding domains, some of which were previously screened against other phage libraries (HD2 and/or HD2_PM). The rationale for this is that mutational phage display is greatly facilitated by previous knowledge of binding motifs.
Comment 9. Finally, the authors discuss the possibility of drugging motif-based interactions, and it is indeed an exciting avenue, although very challenging. In the discussion, the authors could also mention molecular glues that can modulate existing protein/protein interaction interfaces or create entirely new ones. A great example is a recent study showing that the motif-based interaction between mutant beta-catenin and beta-TrCP E3 ligase can be stabilized with a small molecule (PMID 30926793).

Reply: We agree, and we have expanded the discussion in this direction.

Minor comments
Comment 10.
Comment 11. In Figure 2B, instead of (or in addition to) pie graphs it would be useful to show the relative fraction of each category in the original pool, in domain-region interactions, and in perturbed interaction pairs. The same comment applies to Figure 4A, as it is difficult to compare two pie charts to each other (especially with so many categories).
Reply: We agree that it was difficult to compare the data in 2B and we have therefore improved the clarity and added the percentages. However, it is not possible to add a comparison to the original pool, as we do not know how many binding sites are available in the library. We have also changed the layout of Figure 4A.
Comment 12: Many plots have log10(p) as the Y axis, but how the p-value was derived is not indicated in the plot.
Reply: Thank you for pointing this out. The y-axes of the plots have now been modified to indicate the statistical test (Mann-Whitney test) from which the p-values have been derived (Figures 2, 3 and 4).
Comment 13: When you first reference the GenVar library, it is not explicitly introduced. It would be good to state that the "novel phage library" is the GenVar_HD2.
Reply: This is correct, and the text has been modified accordingly.
Comment 14: I'm wondering if the statement that 22% of disease-associated missense mutations map to IDRs is still valid, given that the original estimates were generated prior to resources like ClinVar and gnomAD.Are there more recent estimates?

Reply: The estimate that about 20% of disease-associated mutations map to IDRs seems to remain accurate. For example, a more recent study showed that about 20% of cancer drivers are mutated in the IDRs (PMID: 33806614). We have added this reference to the manuscript.

Reviewer #3:
In this study, the authors explore the role of mutations in intrinsically disordered regions on short linear motif-dependent protein-protein interactions. They apply a previously utilized phage-display prey-bait methodology to this problem, and demonstrate clear effects of many disease causing mutations (somatic and germline). They confirm their findings in these short-sequence phage screens with select intact protein binding measurements, as well as in cell function assays. This is an excellent study that adds to the growing body of knowledge regarding the effects of disease-causing genetic variants on protein-protein interactions and their functional consequences. The proteomics and genetics communities will be most interested in these results, and will likely use them to expand our understanding of protein-protein interactome function in health and disease, as well as to identify potential novel therapeutic targets for complex diseases.
Reply: Thank you very much for appreciating our study.

Comments for the authors follow:
Comment 1: On page 3, paragraph 2, please comment on the frequency of disease-causing mutations localized to protein-protein binding interface sites vs. non-interface sites for somatic and germline mutations.
Reply: Good point. We have added the information that disease-causing somatic and germline mutations are more frequently localised to protein-protein binding interfaces than to non-interface sites, as elegantly shown in the cited reference.
Comment 2: How does the degree of the mutant protein in the PPI affect the importance (likelihood) of short linear motif-dependent binding variations on pathogenicity?

Reply: The acquisition bias caused by screening against a fixed collection of protein domains makes it difficult to answer the question. However, if we focus on our generated PPI network, we see that over half of the mutation-containing proteins (preys) have a degree of 1. When looking at the proportion of the mutations for each prey that we found to modulate an interaction (that is, diminishing or enhancing it), there is no correlation between the degree of the prey and the proportion of interactions perturbed by mutations (R² = 0.0002). This is also true when looking at the numbers from the perspective of the bait degrees (R² = 0.0021).
If the analysis of the degrees of the mutant proteins is instead based on information in publicly available databases, the degrees are higher (preys mean degree = 118, preys median degree = 90) but the correlations are still low (prey R² = 0.0004; bait R² = 0.0005). Finally, it can be noted that our library design was highly biased towards disease-associated mutations (see reply to Reviewer 2, comment 6), but that we did not see any differences in terms of interaction-perturbing effects between pathogenic and benign variants.
Comment 3: For proteins with degree greater than 1, how does a short linear motif mutation that affects one binding interaction affect other binding interactions (if at all)? Are there allosteric consequences to these mutations? Addressing this question with one or two examples would be helpful.
Reply: This is an interesting question, which we realise we did not really address before. We have now added a section on the topic in the discussion. A single disease-associated mutation may affect binding to multiple binding partners. In the simple case, a SLiM, such as a WW domain binding PPxY motif, or a LIR binding to MAP1LC3s, may have the propensity to bind to several members of a given domain family. In such cases, a mutation will affect binding to all members of the family. SLiMs may also be densely located in the IDRs and may sometimes even overlap. In such cases, there are several potential outcomes. The effect may be to abrogate the distinct binding events. We found for example an APC W2658L mutation that reduces the binding to both CSKP and to GABARAPL1. Alternatively, a mutation may affect binding to one SLiM, while leaving the other interaction unaffected. For example, we found a SQSTM1 P348L mutation (334-GDDDWTHLSSKEVD(P/L)STGELQSL-356) to not affect the binding of the LIR motif (WxxL) by the MAP1LC3B ATG domain, but to confer a 25-fold decrease in affinity of the TGE motif for KEAP1 KELCH domain. Finally, a mutation may simultaneously abrogate binding to one protein, while creating a novel binding site for another protein, as demonstrated for the beta-catenin S33F mutation, which is famous for abrogating binding to β-TrCP and which we found to create a novel G3BP1/2 NTF2-like domain binding site. In many cases, the mutated motifs are located in distinct regions of the mutated protein, and consequently mutation of one binding site will not directly affect binding to the other proteins as the binding events are independent. However, even if the binding sites and interactions per se are independent, there may be changes of the interactomes of the mutant protein. For example, the loss of an NLS will lead to cytosolic localization of the protein, and hence loss of binding with nuclear interaction partners (e.g. CDC45).
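The SQSTM1 example can be illustrated with a small sketch that maps simple motif patterns onto the peptide quoted above. The helper names are hypothetical, and the patterns are deliberate simplifications taken from the wording of the reply (WxxL for the LIR, the literal core TGE for the KEAP1-binding motif); full consensus definitions would be more elaborate.

```python
import re

# Peptide and numbering as quoted in the reply (SQSTM1 residues 334-356).
PEPTIDE_START = 334
WILD_TYPE = "GDDDWTHLSSKEVDPSTGELQSL"

def apply_mutation(sequence, position, ref, alt, start=1):
    """Substitute one residue, checking the reference amino acid first."""
    index = position - start
    if sequence[index] != ref:
        raise ValueError(f"expected {ref} at {position}, found {sequence[index]}")
    return sequence[:index] + alt + sequence[index + 1:]

def motif_spans(sequence, pattern, start=1):
    """Return (first, last) protein coordinates of each regex match."""
    return [
        (m.start() + start, m.end() + start - 1)
        for m in re.finditer(pattern, sequence)
    ]

mutant = apply_mutation(WILD_TYPE, 348, "P", "L", PEPTIDE_START)

# Simplified patterns (assumptions): "x" in WxxL becomes "." in the regex.
lir_wt = motif_spans(WILD_TYPE, r"W..L", PEPTIDE_START)
lir_mut = motif_spans(mutant, r"W..L", PEPTIDE_START)
tge_wt = motif_spans(WILD_TYPE, r"TGE", PEPTIDE_START)
tge_mut = motif_spans(mutant, r"TGE", PEPTIDE_START)

print("LIR:", lir_wt, lir_mut)
print("TGE core:", tge_wt, tge_mut)
```

In this sketch both pattern matches survive the P348L substitution, because position 348 lies outside the WxxL match and two residues upstream of the TGE core. That is in line with the unchanged LIR binding, and it also shows why a core-motif match alone would not flag the reported drop in KEAP1 affinity: flanking positions need to be considered as well.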
Regarding allostery, we are focusing on SLiMs found in the intrinsically disordered regions and we typically don't expect allosteric consequences as this would typically be transmitted through a folded structure. However, it is possible that some of the mutated regions make self interactions with the rest of the protein, and that these potential interactions may be affected by the mutations.

Thank you again for sending us your revised manuscript. We are now satisfied with the modifications made and I am pleased to inform you that your paper has been accepted for publication. Your manuscript will be processed for publication by EMBO Press. It will be copy edited and you will receive page proofs prior to publication. Please note that you will be contacted by Springer Nature Author Services to complete licensing and payment information.
You may qualify for financial assistance for your publication charges, either via a Springer Nature fully open access agreement or an EMBO initiative. Check your eligibility: https://www.embopress.org/page/journal/17444292/authorguide#chargesguide
Should you be planning a Press Release on your article, please get in contact with embo_production@springernature.com as early as possible in order to coordinate publication and release dates.
If you have any questions, please do not hesitate to contact the Editorial Office. Thank you for your contribution to Molecular Systems Biology.
Yours sincerely,
Poonam Bheda, PhD
Scientific Editor
Molecular Systems Biology

Please note that it is Molecular Systems Biology policy for the transcript of the editorial process (containing referee reports and your response letter) to be published as an online supplement to each paper. If you do NOT want this, you will need to inform the Editorial Office via email immediately. More information is available here: https://www.embopress.org/transparentprocess#Review_Process

EMBO Press Author Checklist
Please note that a copy of this checklist will be published alongside your article.

Abridged guidelines for figures
1. Data
The data shown in figures should satisfy the following conditions:
New materials and reagents need to be available; do any restrictions apply? Yes. Reagents and Tools Table.

Antibodies
Information included in the manuscript?
In which section is the information available? (Reagents and Tools Table, Materials and Methods, Figures, Data Availability Section)
For antibodies provide the following information:
- Commercial antibodies: RRID (if possible) or supplier name, catalogue number and/or clone number
- Non-commercial: RRID or citation

Yes
Reagents and Tools table.

DNA and RNA sequences
Information included in the manuscript?
In which section is the information available?
Cell lines: Provide species information, strain. Provide accession number in repository OR supplier name, catalog number, clone number, and/OR RRID.

Reagents and Tools table
Primary cultures: Provide species, strain, sex of origin, genetic modification status.

Not Applicable
Report if the cell lines were recently authenticated (e.g., by STR profiling) and tested for mycoplasma contamination.

Not Applicable
No. The cells were not tested for mycoplasma contamination.

Experimental animals Information included in the manuscript?
In which section is the information available?
- a statement of how many times the experiment shown was independently replicated in the laboratory.
- common tests, such as t-test (please specify whether paired vs. unpaired), simple χ2 tests, Wilcoxon and Mann-Whitney tests, can be unambiguously identified by name only, but more complex techniques should be described in the methods section.

Please complete ALL of the questions below. Select "Not Applicable" only when the requested information is not relevant for your study.

- if n<5, the individual data points from each experiment should be plotted. Any statistical test employed should be justified.
- Source Data should be included to report the data underlying figures according to the guidelines set out in the authorship guidelines on Data

Each figure caption should contain the following information, for each panel where they are relevant:
- a specification of the experimental system investigated (e.g. cell line, species name).
- the assay(s) and method(s) used to carry out the reported observations and measurements.
- an explicit mention of the biological and chemical entity(ies) that are being measured.
- an explicit mention of the biological and chemical entity(ies) that are altered/varied/perturbed in a controlled manner.

Study protocol
Information included in the manuscript?
In which section is the information available?
Include a statement about sample size estimate even if no statistical methods were used.

Not Applicable
Were any steps taken to minimize the effects of subjective bias when allocating animals/samples to treatment (e.g. randomization procedure)? If yes, have they been described?

Not Applicable
Include a statement about blinding even if no blinding was done.

Not Applicable
Describe inclusion/exclusion criteria if samples or animals were excluded from the analysis. Were the criteria pre-established?
If sample or data points were omitted from analysis, report if this was due to attrition or intentional exclusion and provide justification.

Yes
Results section. We excluded a set of bait proteins from the analysis as there was no significant enrichment of binding phages.
For every figure, are statistical tests justified as appropriate? Do the data meet the assumptions of the tests (e.g., normal distribution)? Describe any methods used to assess it. Is there an estimate of variation within each group of data? Is the variance similar between the groups that are being statistically compared?

Yes
Sample definition and in-laboratory replication Information included in the manuscript?
In which section is the information available?

Reporting
Adherence to community standards Information included in the manuscript?
In which section is the information available?
Have primary datasets been deposited according to the journal's guidelines (see 'Data Deposition' section) and the respective accession numbers provided in the Data Availability Section?

Data Availability Section
Were human clinical and genomic datasets deposited in a public access-controlled repository in accordance with ethical obligations to the patients and with the applicable consent agreement?

Not Applicable
Are computational models that are central and integral to a study available without restrictions in a machine-readable form? Were the relevant accession numbers or links provided?

Not Applicable
If publicly available data were reused, provide the respective data citations in the reference list.

Not Applicable
The MDAR framework recommends adoption of discipline-specific guidelines, established and endorsed through community initiatives. Journals have their own policy about requiring specific guidelines and recommendations to complement MDAR.
Fig 4a I thought were not very helpful to visually support this statement and it certainly does not allow for a quantification. If possible, it would be nice to compute enrichment scores to support the statements.
page 8, 2nd paragraph: the reference to Fig 4b misses the 4
page 11, first couple of lines: Can you quantify your statement that no disease category was overrepresented in terms of mutations that showed an effect on domain-binding? Fig 6a is not very helpful to visually support this statement.

Comment 9: page 8, 2nd paragraph: the reference to Fig 4b misses the 4
Reply: We have corrected this. Thank you.

Comment 14: Fig 6d and 6e. I honestly don't find these last analyses very useful and helpful. If there are space limitations one could remove this last part.
Reply: We have respectfully kept Fig 6d and 6e, as they might be of interest to some readers and link to the potential druggability of these interactions. Reviewer 2 found this part interesting.

Figure B: Blot of the cycloheximide chase experiment of the BUB1 wild-type and S492F mutant with and without prior bafilomycin A treatment.

Figure C: Quantification of the cycloheximide chase experiment with the stable Flag-BUB1 wild-type and mutant cell lines, with and without bafilomycin A treatment.
Fig 1A: Single nucleotide variances -> Single nucleotide variants
Reply: Thank you for pointing this out, it has now been corrected.

definitions of statistical methods and measures:

Short novel DNA or RNA including primers, probes: provide

Cell materials Information included in the manuscript? In which section is the information available?
(Reagents and Tools Table, Materials and Methods, Figures, Data Availability Section)

Human research participants Information included in the manuscript? In which section is the information available?
(Reagents and Tools Table, Materials and Methods, Figures, Data Availability Section)
If collected and within the bounds of privacy constraints report on age, sex and gender or ethnicity for all study participants.

This checklist is adapted from Materials Design Analysis Reporting (MDAR) Checklist for Authors. MDAR establishes a minimum set of requirements in transparent reporting in the life sciences (see Statement of Task: 10.31222/osf.io/9sm4x). Please follow the journal's guidelines in preparing your
the data were obtained and processed according to the field's best practice and are presented to reflect the results of the experiments in an accurate and unbiased manner.
If your work benefited from core facilities, was their service mentioned in the acknowledgments section? Yes. Methods and materials and in the acknowledgments section.

Checklist for Life Science Articles (updated January
ideally, figure panels should include only measurements that are directly comparable to each other and obtained with the same assay.
plots include clearly labeled error bars for independent experiments and sample sizes. Unless justified, error bars should not be shown for technical
the exact sample size (n) for each experimental group/condition, given as a number, not a range;
a description of the sample collection allowing the reader to understand whether the samples represent technical or biological replicates (including how many animals, litters, cultures, etc.

If study protocol has been pre-registered, provide DOI in the manuscript. For clinical trials, provide the trial registration number OR cite DOI.

In the figure legends: state number of times the experiment was replicated in laboratory.
Include a statement confirming that informed consent was obtained from all subjects and that the experiments conformed to the principles set out in the WMA Declaration of Helsinki and the Department of Health and Human Services Belmont Report.
Studies involving human participants: State details of authority granting ethics approval (IRB or equivalent committee(s)), provide reference number for approval. Not Applicable
Studies involving human participants: Not Applicable
Studies involving human participants: For publication of patient photos, include a statement confirming that consent to publish was obtained.

Dual Use Research of Concern (DURC)
Information included in the manuscript? In which section is the information available? (Reagents and Tools Table, Materials and Methods, Figures, Data Availability Section)
Could your study fall under dual use research restrictions? Please check biosecurity documents and list of select agents and toxins (CDC): https://www.selectagents.gov/sat/list.htm Not Applicable
If you used a select agent, is the security level of the lab appropriate and reported in the manuscript? Not Applicable
If a study is subject to dual use research of concern regulations, is the name of the authority granting approval and reference number for the regulatory approval provided in the manuscript?

State if relevant guidelines or checklists (e.g., ICMJE, MIBBI, ARRIVE, PRISMA) have been followed or provided. Not Applicable
For tumor marker prognostic studies, we recommend that you follow the REMARK reporting guidelines (see link list at top right). See author guidelines, under 'Reporting Guidelines'. Please confirm you have followed these guidelines.
and III randomized controlled trials, please refer to the CONSORT flow diagram (see link list at top right) and submit the CONSORT checklist (see link list at top right) with your submission. See author guidelines, under 'Reporting Guidelines'. Please confirm you have submitted this list.